# Agent-Environment Alignment via Automated Interface Generation

This repository contains the full source code and experiment logs for our ICLR 2026 paper:
**"Agent-Environment Alignment via Automated Interface Generation"**

## Overview

We present **ALIGN**, a framework that automatically generates aligned interfaces to alleviate agent-environment misalignment, significantly improving agent performance across a range of interactive decision-making tasks.

It includes the source code, prompt templates, and experiment logs needed to reproduce our results on four benchmarks:

* **ALFWorld** (`alfworld`)
* **ScienceWorld** (`scienceworld`)
* **WebShop** (`webshop`)
* **M³ToolEval** (`m3tool`)

## Repository Structure

Each benchmark directory (e.g., `alfworld/` or `scienceworld/`) contains the following components:

> **Note**: the placeholders in the file and directory names below take these values:
>
> * `[METHOD]` $\in$ {`vanilla`, `react`, `sc`, `refine`, `planning`}
> * `[MODEL]` $\in$ {`Llama31_8BInstruct`, `Llama33_70BInstruct`, `Qwen2.514BInstruct`}

### Core Files

* `main.py`
  Entry script to configure and launch experiments.

* `analysis_agent.py`
  Core implementation of the **Analyzer** module for misalignment detection.

* `analysis_agent_prompt_[METHOD].py`
  Prompt templates used by the Analyzer for the specified agent method.

* `optimization_agent.py`
  Core implementation of the **Optimizer** module for interface generation.

* `optimization_agent_prompt_[METHOD].py`
  Prompt templates used by the Optimizer for the specified agent method.

* `experiment_[METHOD].py`
  Agent implementation and experiment logic for each agent method.

* `env_simulator_[METHOD].py`
  Simulator environment used by the Analyzer and Optimizer for interface verification.

* `interface_ini.py`
  Initial interface (without ALIGN).

* `interface_[METHOD].py`
  Final ALIGN-generated interfaces for each agent method.
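
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the repository: `expected_files` and `missing_files` are hypothetical helper names, and the only assumption is that every benchmark directory follows the file-name patterns listed above.

```python
from pathlib import Path

# Method names as listed in the note above.
METHODS = ("vanilla", "react", "sc", "refine", "planning")

# Files that do not depend on the agent method.
SHARED_FILES = (
    "main.py",
    "analysis_agent.py",
    "optimization_agent.py",
    "interface_ini.py",
)

# Per-method file-name patterns; "{method}" stands in for [METHOD].
PER_METHOD_TEMPLATES = (
    "analysis_agent_prompt_{method}.py",
    "optimization_agent_prompt_{method}.py",
    "experiment_{method}.py",
    "env_simulator_{method}.py",
    "interface_{method}.py",
)


def expected_files(benchmark_dir, method):
    """Hypothetical helper: paths this README describes for one agent method."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    root = Path(benchmark_dir)
    names = list(SHARED_FILES)
    names += [template.format(method=method) for template in PER_METHOD_TEMPLATES]
    return [root / name for name in names]


def missing_files(benchmark_dir, method):
    """Hypothetical helper: expected paths that are not present on disk."""
    return [path for path in expected_files(benchmark_dir, method) if not path.is_file()]
```

For example, `missing_files("alfworld", "react")` returns an empty list when all nine files described above are present in `alfworld/`.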

### Logs

All experiment results are stored in the `logs/` directory under each benchmark. These include:

* `baseline_[METHOD]`
  Experiments **without** interface enhancement (baseline).

* `final_[METHOD]`
  Experiments **with** the ALIGN-enhanced interface (main result).

* `vanilla_to_[METHOD]`
  Generalization experiments: from vanilla agent to other agent methods.

* `[MODEL]_[METHOD]_baseline` / `[MODEL]_[METHOD]_alignment`
  Cross-model generalization from Qwen2.5-7B-Instruct to other LLMs.

* `noinformation_Qwen257BInstruct_[METHOD]_alignment`
  **Ablation study**: removing the `InferRules` module.

* `nointeraction_Qwen257BInstruct_[METHOD]_alignment`
  **Ablation study**: removing the `WrapStep` module.
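
To locate the logs for a given setting, the directory names can be built from the patterns above. This is a small sketch under stated assumptions: `log_dir_names` is a hypothetical helper, not a function from the repository, and it assumes that a `vanilla_to_[METHOD]` directory exists only for non-vanilla methods.

```python
# Method names as listed in the note under "Repository Structure".
METHODS = ("vanilla", "react", "sc", "refine", "planning")

# Model tag that appears in the ablation directory names above.
ABLATION_MODEL = "Qwen257BInstruct"


def log_dir_names(method, model=None):
    """Hypothetical helper: map each experiment setting to its log directory name."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")

    names = {
        "baseline": f"baseline_{method}",
        "final": f"final_{method}",
        "ablation_no_infer_rules": f"noinformation_{ABLATION_MODEL}_{method}_alignment",
        "ablation_no_wrap_step": f"nointeraction_{ABLATION_MODEL}_{method}_alignment",
    }

    # Assumption: transfer from the vanilla agent only applies to the other methods.
    if method != "vanilla":
        names["vanilla_transfer"] = f"vanilla_to_{method}"

    # Cross-model runs are keyed by a [MODEL] tag, passed in unchanged.
    if model is not None:
        names["cross_model_baseline"] = f"{model}_{method}_baseline"
        names["cross_model_alignment"] = f"{model}_{method}_alignment"

    return names
```

Joining a returned name with `logs/` under a benchmark directory, such as `alfworld/logs/final_react`, gives the path described in this section.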

## Reproducing Results

To launch experiments, see the `main.py` and `experiment_[METHOD].py` scripts in each benchmark directory. All hyperparameters and task splits follow the configurations described in our paper.
